Word Sense Induction for Novel Sense Detection
Abstract
We apply topic modelling to automatically induce word senses of a target word, and demonstrate that our word sense induction method can be used to automatically detect words with emergent novel senses, as well as token occurrences of those senses. We start by exploring the utility of standard topic models for word sense induction (WSI), with a pre-determined number of topics (=senses). We next demonstrate that a non-parametric formulation that learns an appropriate number of senses per word actually performs better at the WSI task. We go on to establish state-of-the-art results over two WSI datasets, and apply the proposed model to a novel sense detection task.
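The idea of reading topics as senses can be illustrated with a minimal sketch. It assumes plain LDA with a fixed number of senses, fitted by collapsed Gibbs sampling over the contexts of a single target word. It is not the paper's model, and it does not cover the non-parametric variant that learns the number of senses. The function name `induce_senses`, the hyperparameters `alpha` and `beta`, and the toy contexts for the word "bank" are all hypothetical.

```python
import random
from collections import Counter


def induce_senses(contexts, num_senses=2, iterations=200,
                  alpha=0.1, beta=0.1, seed=0):
    """Toy LDA (collapsed Gibbs sampling) over contexts of one target word.

    Each context (a list of tokens around the target word) is treated as a
    document and each topic is read as one induced sense. Returns one sense
    id per context: the topic assigned to most of its tokens.
    """
    rng = random.Random(seed)
    vocab_size = len({w for ctx in contexts for w in ctx})

    # Count tables used by the sampler.
    ctx_topic = [[0] * num_senses for _ in contexts]    # topic counts per context
    topic_word = [Counter() for _ in range(num_senses)]  # word counts per topic
    topic_total = [0] * num_senses                       # token counts per topic

    # Random initial topic assignment for every token.
    assignments = []
    for d, ctx in enumerate(contexts):
        z_d = []
        for w in ctx:
            z = rng.randrange(num_senses)
            z_d.append(z)
            ctx_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(z_d)

    for _ in range(iterations):
        for d, ctx in enumerate(contexts):
            for i, w in enumerate(ctx):
                # Remove the token's current assignment from the counts.
                z = assignments[d][i]
                ctx_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1

                # Unnormalised conditional probability of each topic.
                weights = [
                    (ctx_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + beta * vocab_size)
                    for k in range(num_senses)
                ]
                z = rng.choices(range(num_senses), weights=weights)[0]

                # Record the newly sampled assignment.
                assignments[d][i] = z
                ctx_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    return [max(range(num_senses), key=lambda k: counts[k])
            for counts in ctx_topic]


# Hypothetical contexts of the target word "bank" (target word removed).
contexts = [
    ["river", "water", "shore", "fishing"],
    ["money", "loan", "account", "deposit"],
    ["water", "river", "flood", "shore"],
    ["loan", "interest", "money", "account"],
]
senses = induce_senses(contexts, num_senses=2)
```

Here `senses` holds one induced sense id per context. A novel sense detector could then compare how often each induced sense occurs in an older corpus versus a newer one, although that comparison is not shown in this sketch.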
Similar resources
Automatic Biomedical Term Polysemy Detection
Polysemy is the capacity for a word to have multiple meanings. Polysemy detection is a first step for Word Sense Induction (WSI), which makes it possible to find the different meanings of a term. Polysemy detection is also important for information extraction (IE) systems and for building and enriching terminologies and ontologies. In this paper, we present a nov...
From the Culinary to the Political Meaning of "quenelle" : Using Topic Models For Identifying Novel Senses (De la quenelle culinaire à la quenelle politique : identification de changements sémantiques à l'aide des Topic Models) [in French]
In this study we explore topic modeling for the automatic detection of new senses of known words. We apply methods developed in previous work for English (Lau et al., 2012, 2014) to a recent case of a new word sense emerging in French, namely the appearance of the new gesture meaning of the word « quenelle ». Our experiments illustrate the potential of this approach for learning word senses, ...
Word Sense Induction by Community Detection
Word Sense Induction (WSI) is an unsupervised approach for learning the multiple senses of a word. Graph-based approaches to WSI frequently represent word co-occurrence as a graph and use the statistical properties of the graph to identify the senses. We reinterpret graph-based WSI as community detection, a well-studied problem in network science. The relations in the co-occurrence graph give r...
Word sense induction using word embeddings and community detection in complex networks
Word Sense Induction (WSI) is the ability to automatically induce word senses from corpora. The WSI task was first proposed to overcome the limitations of the manually annotated corpora required by word sense disambiguation systems. Even though several methods have been proposed to induce word senses, existing systems are still very limited in the sense that they make use of structured, domai...
Prédiction de la polysémie pour un terme biomédical
Polysemy is the capacity for a term to have multiple meanings. Polysemy prediction is a first step for Word Sense Induction (WSI), which makes it possible to find the different meanings of a term, as well as for Information Extraction (IE) systems. In addition, polysemy detection is important for building and enriching terminologies and ontologies. In this paper, we present a novel approach to detect if ...